Talking Face Generation by Adversarially Disentangled Audio-Visual Representation
Authors
Abstract
Similar Papers
Audio-visual talking face detection
Talking face detection is important for videoconferencing. However, detecting the talking face is difficult because of the low resolution of capture devices, the informal style of communication, and background sounds. In this paper, we present a novel method for finding the talking face using a latent semantic indexing approach. We tested our method on a comprehensive set of home ...
An Audio-Visual Imposture Scenario by Talking Face Animation
With the advent of PDAs, handheld PCs, and mobile telephones that use biometric recognition for user authentication, there is a growing demand for automatic, non-intrusive voice and face speaker verification systems. Such systems can be embedded in mobile devices to allow biometrically recognized users to sign and send data electronically, and to give their telephone conversation...
A Cantonese Speech-Driven Talking Face Using Translingual Audio-to-Visual Conversion
This paper proposes a novel approach toward a videorealistic, speech-driven talking face for Cantonese. We present a technique that realizes a talking face for a target language (Cantonese) using only audio-visual facial recordings for a base language (English). Given Cantonese speech input, we first use a Cantonese speech recognizer to generate a Cantonese syllable transcription. Then we ma...
Disentangled Person Image Generation
Generating novel, yet realistic, images of persons is a challenging task due to the complex interplay between the different image factors, such as the foreground, background and pose information. In this work, we aim at generating such images based on a novel, two-stage reconstruction pipeline that learns a disentangled representation of the aforementioned image factors and generates novel pers...
Audio-visual synthesis of talking faces from speech production correlates
This paper presents technical refinements and extensions of our system for correlating audible and visible components of speech behavior and subsequently using those correlates to generate realistic talking faces. Introduction of nonlinear estimation techniques has improved our ability to generate facial motion either from the speech acoustics or from orofacial muscle EMG. Also, preliminary evi...
Journal
Journal title: Proceedings of the AAAI Conference on Artificial Intelligence
Year: 2019
ISSN: 2374-3468,2159-5399
DOI: 10.1609/aaai.v33i01.33019299